INTRODUCTION

Optimal control system analysis starts with the characterization
of systems by state variables and the design of systems by
state-space techniques. In general, optimal control problems are
viewed as variational problems. There are many possible
variational methods for maximizing or minimizing a functional over
a function space. They range from classical methods in the calculus
of variations to numerical and successive-approximation techniques
applied to experimental or model systems. Among the methods
commonly used in control system design are:

1.  The calculus of variations.

2.  The maximum principle.

3.  Dynamic programming.

In all cases the goal is to find the optimum control law or
sequence such that a given function of performance indices is
maximized or minimized. It will be found that the common property
of all three methods is the use of variational principles. Each of
these methods is related to a well-known formulation in classical
mechanics: the first to the Euler-Lagrange equation, the second to
Hamilton's principle, and the third to the Hamilton-Jacobi theory.
The maximum principle employs a more or less direct procedure of
the calculus of variations, whereas dynamic programming, while
still following the variational principles, uses a recurrence
relationship or a partial differential equation.


THE  CALCULUS  OF  VARIATIONS 


1.   Outline 


The calculus of variations is that branch of the calculus which
is concerned with optimization problems under more general
conditions than those considered in the ordinary theory of maxima
and minima (extrema). There are three fundamental problems in the
calculus of variations: the Lagrange problem, the Mayer problem,
and the Bolza problem.

1)  The Lagrange Problem. The Lagrange problem in one independent
variable is concerned with the determination of a function m(t)
which minimizes the integral of a given function. The
minimum-integral control problem is of this type. It can be
stated in equations as follows. Let the following be given:

a)  A set of differential equations

$$ \dot{x}_i = f_i(x, m, t), $$

or more generally

$$ \phi_i(x, \dot{x}, m, t) = 0, \qquad i = 1, 2, \ldots, n \tag{1-1} $$

where x is an n-vector and m is an r-vector.

b)  A set of initial conditions

$$ x_i(t_0) = a_i, \qquad i = 1, 2, \ldots, n. \tag{1-2} $$

c)  A criterion function

$$ I = \int_{t_0}^{t_f} F(x, m, t)\,dt \tag{1-3} $$

where F(x, m, t) is a continuous function of its arguments.


The problem is then to determine the function m(t) which minimizes
I over all functions m(t), subject to the conditions given in
equations (1-1) and (1-2).

2)  The Mayer Problem. The Mayer problem is concerned with the
determination of a function m(t) which minimizes a given function
evaluated at the end point, containing some variables whose final
values are unspecified in advance. Time-optimal problems are
usually classified as Mayer problems. This problem may be stated
as follows:

a)  Given a set of differential equations

$$ \dot{x}_i = f_i(x, m, t) \quad\text{or}\quad \phi_i(x, \dot{x}, m, t) = 0, \qquad i = 1, 2, \ldots, n. \tag{1-4} $$

b)  Given a set of initial conditions

$$ x_i(t_0) = a_i, \qquad i = 1, 2, \ldots, n. \tag{1-5} $$

c)  Given a set of final conditions

$$ x_j(t_f) = b_j \tag{1-6} $$

where j belongs to some subset of the integers 1, 2, ..., n,
and $t_f$ is unspecified.

d)  Given a criterion function

$$ I = G(x, m, t)\,\Big|_{t_0}^{t_f} \tag{1-7} $$


Determine the function m(t) which minimizes I over all functions
m(t) subject to the conditions given in equations (1-4), (1-5),
and (1-6).


3)  The Bolza Problem. The Bolza problem is concerned with the
determination of a function m(t) which minimizes the sum of the
integral of a function and a function evaluated at the end point,
the latter containing some variables whose final values are
unspecified in advance. An optimal control system subject to
certain constraints can be studied as a problem of Bolza type.
This is the most general case; the Mayer problem and the Lagrange
problem are special cases of it. This problem may be stated as
follows:

a)  Given a set of differential equations as described in
equation (1-1) or (1-4).

b)  Given a set of initial conditions as described in
equation (1-2) or (1-5).

c)  Given a set of final conditions as described in
equation (1-6).

d)  Given a criterion function

$$ I = G(x, m, t)\,\Big|_{t_0}^{t_f} + \int_{t_0}^{t_f} F(x, m, t)\,dt . \tag{1-8} $$

Determine the function m(t) which minimizes I over all functions
m(t) subject to the conditions given in equations (1-1),
(1-4), and (1-6).

It is easy to see that if G(x, m, t) is equal to zero in
equation (1-8), it becomes equation (1-3); if F(x, m, t) is
equal to zero in equation (1-8), it becomes equation (1-7).
However, some auxiliary variables can always be introduced which
transform a Lagrange problem into a Bolza problem or a Mayer
problem, and vice versa. In other words, these problems can be
converted from one to another. Although there are many optimal
control problems which do not seem to belong to any of these
formulations, there is always some mathematical artifice which
will reduce the initial scheme to one of those considered above.
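
As one illustration of such a conversion, a Lagrange criterion of the form (1-3) can be restated as a Mayer criterion of the form (1-7). This is only a sketch in the notation above; the auxiliary variable $x_{n+1}$ is introduced here for the example and is not part of the original problem statement.

```latex
% Auxiliary state accumulating the running cost:
\dot{x}_{n+1} = F(x, m, t), \qquad x_{n+1}(t_0) = 0
% The integral criterion then becomes an end-point criterion:
\quad\Longrightarrow\quad
I = \int_{t_0}^{t_f} F(x, m, t)\,dt
  = x_{n+1}(t_f)
  = x_{n+1}\,\Big|_{t_0}^{t_f}
```

Here $G = x_{n+1}$ plays the role of the end-point function, at the price of one additional differential equation.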


2.   Basic  Principle  of  the  Calculus  of 
Variation  in  Minimization  Problems 


We consider the fixed-end-point system and the movable-end-point
system.

1)  Fixed-end-point System. Consider the problem of minimizing
the integral

$$ I = \int_{t_0}^{t_f} F(x, \dot{x}, t)\,dt \tag{1-9} $$

where x = x(t) is a twice-differentiable function satisfying the
conditions $x(t_0) = x_0$ and $x(t_f) = x_f$, and F is a scalar
continuous function of the scalar arguments $x$, $\dot{x}$, $t$.
Determine the function x(t) which minimizes the integral of
equation (1-9).

If one interprets this geometrically, the problem is to determine
the curve x(t) connecting the points $(x_0, t_0)$ and $(x_f, t_f)$
such that the integral along the curve of some given function
$F(x, \dot{x}, t)$ is a minimum. In control terminology, if x(t)
is the output of a controlled system, then the integral given in
equation (1-9) describes a measure of the overall performance of
the system. The criterion of performance is that this integral be
minimal. Let x(t) be the minimizing function and $\bar{x}(t)$ be a
neighboring function of x(t). Then x(t) and $\bar{x}(t)$ are
related by

$$ \bar{x}(t) = x(t) + \epsilon\,\eta(t) \tag{1-10} $$

$$ \dot{\bar{x}}(t) = \dot{x}(t) + \epsilon\,\dot{\eta}(t) \tag{1-11} $$

where $\epsilon$ is a small parameter and $\eta(t)$ is an arbitrary
differentiable function for which

$$ \eta(t_0) = \eta(t_f) = 0 \tag{1-12} $$

since the end points are assumed to be fixed, as shown in
Fig. (1-1). The condition given in equation (1-12) ensures that

$$ \bar{x}(t_0) = x(t_0) = x_0 \quad\text{and}\quad \bar{x}(t_f) = x(t_f) = x_f . $$


Fig. (1-1).  Optimum trajectory with fixed end points.


The vertical deviation of any curve $\bar{x}(t)$ from the actual
minimizing curve is given by $\epsilon\,\eta(t)$, as illustrated in
Fig. (1-1). No matter which $\eta(t)$ is chosen, the minimizing
function x(t) is the member of that family corresponding to the
parameter value $\epsilon = 0$.

Replacing $x$ and $\dot{x}$ in equation (1-9), respectively, by
$\bar{x}$ and $\dot{\bar{x}}$ yields


$$ I(\epsilon) = \int_{t_0}^{t_f} F(x + \epsilon\eta,\ \dot{x} + \epsilon\dot{\eta},\ t)\,dt \tag{1-13} $$


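By Taylor's series expansion, equation (1-13) becomes

$$ I(\epsilon) = \int_{t_0}^{t_f} \Big[ F(x, \dot{x}, t) + \epsilon\eta\,\frac{\partial F}{\partial x} + \epsilon\dot{\eta}\,\frac{\partial F}{\partial \dot{x}} + \frac{1}{2!}\Big( \epsilon^2\eta^2\,\frac{\partial^2 F}{\partial x^2} + 2\epsilon^2\eta\dot{\eta}\,\frac{\partial^2 F}{\partial x\,\partial \dot{x}} + \epsilon^2\dot{\eta}^2\,\frac{\partial^2 F}{\partial \dot{x}^2} \Big) + \cdots \Big]\,dt \tag{1-14} $$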

The necessary condition for I to be a maximum or a minimum is
that

$$ \frac{\partial I(\epsilon)}{\partial \epsilon}\bigg|_{\epsilon = 0} = 0 . \tag{1-15} $$

This leads to the condition that

$$ \int_{t_0}^{t_f} \Big( \eta\,\frac{\partial F}{\partial x} + \dot{\eta}\,\frac{\partial F}{\partial \dot{x}} \Big)\,dt = 0 \tag{1-16} $$


Equation (1-16) follows from equation (1-14) because the terms in
$\epsilon^2, \epsilon^3, \ldots$ contribute nothing to the
derivative at $\epsilon = 0$.

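Integrating by parts the second term of this integral gives

$$ \int_{t_0}^{t_f} \dot{\eta}\,\frac{\partial F}{\partial \dot{x}}\,dt = \eta\,\frac{\partial F}{\partial \dot{x}}\bigg|_{t_0}^{t_f} - \int_{t_0}^{t_f} \eta\,\frac{d}{dt}\Big(\frac{\partial F}{\partial \dot{x}}\Big)\,dt = - \int_{t_0}^{t_f} \eta\,\frac{d}{dt}\Big(\frac{\partial F}{\partial \dot{x}}\Big)\,dt \tag{1-17} $$

where the boundary term $\eta\,\partial F/\partial \dot{x}\,\big|_{t_0}^{t_f}$
is equal to zero by the condition of equation (1-12).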


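Equation (1-16) then reduces to

$$ \int_{t_0}^{t_f} \eta\,\Big[ \frac{\partial F}{\partial x} - \frac{d}{dt}\Big(\frac{\partial F}{\partial \dot{x}}\Big) \Big]\,dt = 0 \tag{1-18} $$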


Since equation (1-18) must hold for all $\eta$, the necessary
condition for I to be an extremum is

$$ \frac{\partial F}{\partial x} - \frac{d}{dt}\Big(\frac{\partial F}{\partial \dot{x}}\Big) = 0 \tag{1-19} $$

This second-order equation is known as the Euler-Lagrange
differential equation. Its solution gives the function which
minimizes the integral of the problem, provided the minimum
exists.
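
The variational argument above can be checked numerically on a small example. The sketch below assumes an integrand chosen only for illustration, F = ẋ² + x² on the interval [0, 1] with x(0) = 0 and x(1) = 1, for which the Euler-Lagrange equation (1-19) reads ẍ = x and has the solution x(t) = sinh t / sinh 1. The perturbation η(t) = sin(πt) and the helper names are likewise assumptions of the example.

```python
import math

def x_star(t):
    """Solution of the Euler-Lagrange equation x'' = x with x(0)=0, x(1)=1."""
    return math.sinh(t) / math.sinh(1.0)

def x_star_dot(t):
    """Time derivative of x_star."""
    return math.cosh(t) / math.sinh(1.0)

def cost(eps, n=2000):
    """I(eps): integral over [0, 1] of xdot^2 + x^2 for x = x_star + eps*eta.

    eta(t) = sin(pi t) vanishes at both end points, as required by (1-12).
    The integral is approximated with the midpoint rule.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = x_star(t) + eps * math.sin(math.pi * t)
        xd = x_star_dot(t) + eps * math.pi * math.cos(math.pi * t)
        total += (xd * xd + x * x) * h
    return total

i_opt = cost(0.0)
# Central difference approximating dI/d(eps) at eps = 0, cf. (1-15).
slope = (cost(1e-4) - cost(-1e-4)) / 2e-4

print("I(0) =", i_opt)
print("dI/deps at 0 =", slope)
print("I(+0.1), I(-0.1) =", cost(0.1), cost(-0.1))
```

For this integrand the derivative at ε = 0 should be close to zero and the cost should rise on either side of ε = 0, which is what conditions (1-15) and (1-19) express.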

In the case of multidimensional functions, the criterion
integral of equation (1-9) becomes

$$ I = \int_{t_0}^{t_f} F(x, \dot{x}, t)\,dt \tag{1-20} $$

where x = x(t) is an n-vector function of t, each component
function being twice differentiable:

$$ x(t) = \big(x_1(t), x_2(t), \ldots, x_n(t)\big) \tag{1-21} $$

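The Euler-Lagrange differential equation for the
multidimensional case is

$$ \nabla_x F - \frac{d}{dt}\big(\nabla_{\dot{x}} F\big) = 0 \tag{1-22} $$

where

$$ \nabla_x F = \Big( \frac{\partial F}{\partial x_1}, \frac{\partial F}{\partial x_2}, \ldots, \frac{\partial F}{\partial x_n} \Big)' \tag{1-23} $$

and

$$ \nabla_{\dot{x}} F = \Big( \frac{\partial F}{\partial \dot{x}_1}, \frac{\partial F}{\partial \dot{x}_2}, \ldots, \frac{\partial F}{\partial \dot{x}_n} \Big)' \tag{1-24} $$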


2)  Movable-end-point System. Now we discuss the minimization
problem with the end point of the trajectory lying on a
curve x = c(t), as shown in Fig. (1-2).


Fig. (1-2).  Optimal trajectory with movable end point.


Let x(t) be the function which minimizes the criterion integral
given in equation (1-9). The end point of the trajectory is
assumed to lie on the curve x = c(t), as shown in Fig. (1-2).
Assume that $\bar{x}(t)$ is a neighboring function of x(t). The
relationship between x(t) and $\bar{x}(t)$ is the same as in
equations (1-10) and (1-11). The arbitrary function $\eta(t)$
satisfies the initial condition

$$ \eta(t_0) = 0 \tag{1-25} $$

but the final condition is as yet undefined.

By replacing $x$ and $\dot{x}$ in equation (1-9), respectively, by
$\bar{x}$ and $\dot{\bar{x}}$, and the upper limit of integration
by $t_f + \epsilon\,\delta t_f$, one obtains
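$$
\begin{aligned}
I(\epsilon) &= \int_{t_0}^{t_f + \epsilon\,\delta t_f} F(x + \epsilon\eta,\ \dot{x} + \epsilon\dot{\eta},\ t)\,dt \\
&= \int_{t_0}^{t_f} F(x + \epsilon\eta,\ \dot{x} + \epsilon\dot{\eta},\ t)\,dt + \int_{t_f}^{t_f + \epsilon\,\delta t_f} F(x + \epsilon\eta,\ \dot{x} + \epsilon\dot{\eta},\ t)\,dt \\
&= I(0) + \epsilon \int_{t_0}^{t_f} \Big( \eta\,\frac{\partial F}{\partial x} + \dot{\eta}\,\frac{\partial F}{\partial \dot{x}} \Big)\,dt + \epsilon\,\delta t_f\, F\big(x(\xi) + \epsilon\eta(\xi),\ \dot{x}(\xi) + \epsilon\dot{\eta}(\xi),\ \xi\big) \\
&= I(0) + \epsilon \int_{t_0}^{t_f} \Big( \eta\,\frac{\partial F}{\partial x} + \dot{\eta}\,\frac{\partial F}{\partial \dot{x}} \Big)\,dt + \epsilon\,\delta t_f\, F(t_f)
\end{aligned}
\tag{1-26}
$$

to first order in $\epsilon$, where $I(0) = \int_{t_0}^{t_f} F(x, \dot{x}, t)\,dt$,
$F(t_f)$ denotes F evaluated on the minimizing trajectory at $t_f$, and
$t_f \le \xi \le t_f + \epsilon\,\delta t_f$.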

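As $\epsilon\,\delta t_f$ is very small, we can replace $\xi$ by
$t_f$. The necessary condition for I to be an extremum is

$$ \frac{\partial I(\epsilon)}{\partial \epsilon}\bigg|_{\epsilon = 0} = 0 $$

Therefore

$$ \int_{t_0}^{t_f} \Big( \eta\,\frac{\partial F}{\partial x} + \dot{\eta}\,\frac{\partial F}{\partial \dot{x}} \Big)\,dt + \delta t_f\, F(t_f) = 0 \tag{1-27} $$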

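Integrating the term $\dot{\eta}\,\partial F/\partial \dot{x}$ by
parts as before, and rearranging, gives

$$ \int_{t_0}^{t_f} \eta\,\Big[ \frac{\partial F}{\partial x} - \frac{d}{dt}\Big(\frac{\partial F}{\partial \dot{x}}\Big) \Big]\,dt + \eta(t_f)\,\frac{\partial F}{\partial \dot{x}}\bigg|_{t_f} - \eta(t_0)\,\frac{\partial F}{\partial \dot{x}}\bigg|_{t_0} + F(t_f)\,\delta t_f = 0 \tag{1-28} $$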

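Examination of the end condition shown in Fig. (1-2) leads
to the following relation:

$$ \dot{x}(t_f)\,\epsilon\,\delta t_f + \epsilon\,\eta(t_f) = \dot{c}(t_f)\,\epsilon\,\delta t_f $$

or

$$ \dot{x}(t_f)\,\delta t_f + \eta(t_f) = \dot{c}(t_f)\,\delta t_f \tag{1-29} $$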

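Substituting equation (1-29) into equation (1-28) yields

$$ \int_{t_0}^{t_f} \eta\,\Big[ \frac{\partial F}{\partial x} - \frac{d}{dt}\Big(\frac{\partial F}{\partial \dot{x}}\Big) \Big]\,dt + \Big\{ F(t_f) + \big[\dot{c}(t_f) - \dot{x}(t_f)\big]\,\frac{\partial F}{\partial \dot{x}}\bigg|_{t_f} \Big\}\,\delta t_f - \eta(t_0)\,\frac{\partial F}{\partial \dot{x}}\bigg|_{t_0} = 0 \tag{1-30} $$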


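Since the boundary condition gives $\eta(t_0) = 0$ and
$\delta t_f$ is arbitrary, equation (1-30) leads to

$$ \int_{t_0}^{t_f} \eta\,\Big[ \frac{\partial F}{\partial x} - \frac{d}{dt}\Big(\frac{\partial F}{\partial \dot{x}}\Big) \Big]\,dt = 0 \tag{1-31} $$

and

$$ F(t_f) + \big[\dot{c}(t_f) - \dot{x}(t_f)\big]\,\frac{\partial F}{\partial \dot{x}}\bigg|_{t_f} = 0 \tag{1-32} $$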


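Since $\eta$ is arbitrary and not identically zero in equation
(1-31), it follows that

$$ \frac{\partial F}{\partial x} - \frac{d}{dt}\Big(\frac{\partial F}{\partial \dot{x}}\Big) = 0 \tag{1-33} $$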


This is exactly the same form discussed in case 1, the
fixed-end-point system.

From equation (1-32)

$$ F(t_f) = \big[\dot{x}(t_f) - \dot{c}(t_f)\big]\,\frac{\partial F}{\partial \dot{x}}\bigg|_{t_f} \tag{1-34} $$


For the special case in which the end-point curve x = c(t) is a
horizontal straight line, so that

$$ \dot{c}(t_f) = 0 , $$

equation (1-34) becomes

$$ F(t_f) = \dot{x}(t_f)\,\frac{\partial F}{\partial \dot{x}}\bigg|_{t_f} \tag{1-35} $$


For the movable-end-point system, the conditions of equations
(1-33) and (1-34) must both hold; equation (1-34) is referred to
as the transversality condition.
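
A short worked case may make the transversality condition more concrete. The integrand below (arc length, so that the problem is to find the shortest path from a fixed point to the curve x = c(t)) is an assumption chosen for illustration and does not appear in the text above.

```latex
% Assumed integrand: arc length
F = \sqrt{1 + \dot{x}^2}, \qquad
\frac{\partial F}{\partial \dot{x}} = \frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}

% Euler-Lagrange equation (1-33): F does not depend on x, so
\frac{d}{dt}\Big( \frac{\dot{x}}{\sqrt{1 + \dot{x}^2}} \Big) = 0
\;\Longrightarrow\; \dot{x} = \text{const (a straight line)}

% Transversality condition (1-34) at t = t_f:
\sqrt{1 + \dot{x}^2} = (\dot{x} - \dot{c})\,\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}
\;\Longrightarrow\; 1 + \dot{x}^2 = \dot{x}^2 - \dot{c}\,\dot{x}
\;\Longrightarrow\; \dot{x}(t_f)\,\dot{c}(t_f) = -1
```

In this case the condition states that the minimizing straight line meets the end-point curve at a right angle.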

3.  Application  of  the  Calculus  of  Variation 

Minimum-integral control problems are studied here as an
application of the variational calculus to the optimum design
of control processes. Consider a control process characterized by

$$ \dot{x}(t) = g(x, m) \tag{1-36} $$

where x and m are analytic scalar functions of t, and at $t = t_0$

$$ x(t_0) = x_0 \tag{1-37} $$

Determine the optimum control signal m(t) which minimizes the
integral-criterion function

$$ I = \int_{t_0}^{t_f} F(x, m)\,dt \tag{1-38} $$

Let x and m be a pair of functions which yield the minimum of
equation (1-38). Then the neighboring functions $\bar{x}$ and
$\bar{m}$ may be expressed as

$$ \bar{x} = x + \epsilon\,\eta \tag{1-39} $$

$$ \bar{m} = m + \epsilon\,\zeta \tag{1-40} $$

where $\eta$ and $\zeta$ are arbitrary differentiable functions of
t, defined for $t_0 \le t \le t_f$, and $\epsilon$ is a small
parameter. Replacing x and m in equation (1-38), respectively, by
$\bar{x}$ and $\bar{m}$ yields the neighboring value of the
criterion


$$ I(\epsilon) = \int_{t_0}^{t_f} F(x + \epsilon\eta,\ m + \epsilon\zeta)\,dt \tag{1-41} $$

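By Taylor's series expansion, one obtains

$$ I(\epsilon) = \int_{t_0}^{t_f} F(x, m)\,dt + \epsilon \int_{t_0}^{t_f} \Big( \eta\,\frac{\partial F}{\partial x} + \zeta\,\frac{\partial F}{\partial m} \Big)\,dt + \text{terms in } \epsilon^2, \epsilon^3, \ldots \tag{1-42} $$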


We omit the higher-order infinitesimal terms in
$\epsilon^2, \epsilon^3, \ldots$, since the minimum of the
integral occurs when

$$ \frac{\partial I}{\partial \epsilon}\bigg|_{\epsilon = 0} = 0 \tag{1-43} $$


The necessary condition for I to be an extremum is

$$ \int_{t_0}^{t_f} \Big( \eta\,\frac{\partial F}{\partial x} + \zeta\,\frac{\partial F}{\partial m} \Big)\,dt = 0 \tag{1-44} $$

Replacing x and m in equation (1-36), respectively, by $\bar{x}$
and $\bar{m}$ yields

$$ \dot{x} + \epsilon\,\dot{\eta} = g(x + \epsilon\eta,\ m + \epsilon\zeta) \tag{1-45} $$

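By Taylor's series expansion

$$ \dot{x} + \epsilon\,\dot{\eta} = g(x, m) + \epsilon\,\Big( \eta\,\frac{\partial g}{\partial x} + \zeta\,\frac{\partial g}{\partial m} \Big) + \text{terms in } \epsilon^2, \epsilon^3, \ldots \tag{1-46} $$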

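From equation (1-46), we know

$$ \dot{x} = g(x, m) \tag{1-47} $$

$$ \dot{\eta} = \eta\,\frac{\partial g}{\partial x} + \zeta\,\frac{\partial g}{\partial m} \tag{1-48} $$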

Solving for $\zeta$ from equation (1-48), we obtain

$$ \zeta = \frac{\dot{\eta} - \eta\,(\partial g/\partial x)}{\partial g/\partial m} . \tag{1-49} $$

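Substituting equation (1-49) into equation (1-44) yields

$$ \int_{t_0}^{t_f} \Big[ \eta\,\frac{\partial F}{\partial x} + \Big( \dot{\eta} - \eta\,\frac{\partial g}{\partial x} \Big)\,\frac{\partial F/\partial m}{\partial g/\partial m} \Big]\,dt = 0 \tag{1-50} $$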


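Integrating by parts yields

$$ \int_{t_0}^{t_f} \dot{\eta}\,\frac{\partial F/\partial m}{\partial g/\partial m}\,dt = \eta\,\frac{\partial F/\partial m}{\partial g/\partial m}\bigg|_{t_0}^{t_f} - \int_{t_0}^{t_f} \eta\,\frac{d}{dt}\Big( \frac{\partial F/\partial m}{\partial g/\partial m} \Big)\,dt = - \int_{t_0}^{t_f} \eta\,\frac{d}{dt}\Big( \frac{\partial F/\partial m}{\partial g/\partial m} \Big)\,dt \tag{1-51} $$

since $x(t_0) = x_0$, so that $\eta(t_0) = 0$, and

$$ \frac{\partial F/\partial m}{\partial g/\partial m}\bigg|_{t_f} = 0 \tag{1-52} $$

Equation (1-50) becomes

$$ \int_{t_0}^{t_f} \eta\,\frac{\partial F}{\partial x}\,dt - \int_{t_0}^{t_f} \eta\,\frac{d}{dt}\Big( \frac{\partial F/\partial m}{\partial g/\partial m} \Big)\,dt - \int_{t_0}^{t_f} \eta\,\frac{\partial g}{\partial x}\,\frac{\partial F/\partial m}{\partial g/\partial m}\,dt = 0 $$

and can be rewritten as

$$ \int_{t_0}^{t_f} \eta\,\Big[ \frac{\dfrac{\partial F}{\partial x}\,\dfrac{\partial g}{\partial m}}{\partial g/\partial m} - \frac{\partial g}{\partial x}\,\frac{\partial F/\partial m}{\partial g/\partial m} \Big]\,dt - \int_{t_0}^{t_f} \eta\,\frac{d}{dt}\Big( \frac{\partial F/\partial m}{\partial g/\partial m} \Big)\,dt = 0 $$

and furthermore, as

$$ \int_{t_0}^{t_f} \Big[ \frac{\dfrac{\partial F}{\partial x}\,\dfrac{\partial g}{\partial m} - \dfrac{\partial F}{\partial m}\,\dfrac{\partial g}{\partial x}}{\partial g/\partial m} - \frac{d}{dt}\Big( \frac{\partial F/\partial m}{\partial g/\partial m} \Big) \Big]\,\eta\,dt = 0 \tag{1-53} $$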


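Since equation (1-53) must hold for all $\eta$, the Euler-Lagrange
differential equation for this optimum control problem is

$$ \frac{\dfrac{\partial F}{\partial x}\,\dfrac{\partial g}{\partial m} - \dfrac{\partial F}{\partial m}\,\dfrac{\partial g}{\partial x}}{\partial g/\partial m} - \frac{d}{dt}\Big( \frac{\partial F/\partial m}{\partial g/\partial m} \Big) = 0 \tag{1-54} $$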

The  solution  of  this  differential  equation  subject  to  the 
boundary  conditions  specified  in  equations  (1-37)  and  (1-52) 
gives  the  optimum  control  m(t)  for  the  process. 
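For the special case

$$ \dot{x} = g(x, m) = m $$

the integral criterion function becomes

$$ I = \int_{t_0}^{t_f} F(x, \dot{x})\,dt $$

and

$$ \frac{\partial g}{\partial m} = 1, \qquad \frac{\partial g}{\partial x} = 0 . $$

The Euler-Lagrange equation for this special case reduces to

$$ \frac{\partial F}{\partial x} - \frac{d}{dt}\Big( \frac{\partial F}{\partial m} \Big) = 0 $$

or

$$ \frac{\partial F}{\partial x} - \frac{d}{dt}\Big( \frac{\partial F}{\partial \dot{x}} \Big) = 0 $$

with the boundary conditions

$$ x(t_0) = x_0 \quad\text{and}\quad \frac{\partial F}{\partial \dot{x}}\bigg|_{t_f} = 0 . $$

This is the same result as equation (1-33).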

The same process extends to a multidimensional control process
by replacing the scalar functions with multidimensional vectors.

The previous discussion assumes that neither the control signals
nor the state variables are subject to constraints. In realistic
optimum control of physical processes, however, constraints on
control signals or state variables must be taken into
consideration. For solving optimum control problems with
constraints, we usually use the method of Lagrange multipliers,
which will be discussed in the section on dynamic programming.

The optimum design of control systems by the calculus of
variations leads to a two-point boundary value problem. Analytical
solutions for such problems are possible only in special cases, so
trial-and-error techniques must be resorted to. These techniques
first guess a value for the missing initial condition and then
integrate numerically the Euler-Lagrange and constraining
equations. In general there will be a difference between the
resulting final condition and the specified condition, so the
trial-and-error process must be repeated several times until the
value of the final condition obtained in this way agrees
sufficiently closely with the specified value. Because of this
shortcoming, the classical calculus of variations is less
attractive in the design of optimal control systems.
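
The trial-and-error procedure just described is usually called a shooting method. The sketch below applies it to an example chosen only for illustration: the process ẋ = -x + m with F = x² + m², x(0) = 1 and t_f = 1. For this choice equation (1-54) gives ṁ = x + m, and condition (1-52) gives m(t_f) = 0. The function names, the fourth-order Runge-Kutta integrator, and the secant update of the guess are all assumptions of the example, not part of the original text.

```python
import math

def shoot(m0, x0=1.0, t_final=1.0, n=1000):
    """Integrate xdot = -x + m, mdot = x + m from t = 0 with the guessed
    initial control m0 and return the resulting final value m(t_final)."""
    def rhs(x, m):
        return -x + m, x + m

    h = t_final / n
    x, m = x0, m0
    for _ in range(n):
        # Classical fourth-order Runge-Kutta step.
        k1 = rhs(x, m)
        k2 = rhs(x + 0.5 * h * k1[0], m + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], m + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], m + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        m += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return m

def solve_initial_control(tol=1e-10, max_iter=20):
    """Repeat the guess-and-integrate step until m(t_final) is close to 0."""
    prev_guess, guess = 0.0, -1.0          # two arbitrary starting guesses
    prev_miss, miss = shoot(prev_guess), shoot(guess)
    for _ in range(max_iter):
        if abs(miss) < tol:
            break
        # Secant update of the missing initial condition.
        new_guess = guess - miss * (guess - prev_guess) / (miss - prev_miss)
        prev_guess, prev_miss = guess, miss
        guess = new_guess
        miss = shoot(guess)
    return guess

m0 = solve_initial_control()
print("m(0) found by shooting:", m0)
print("residual m(t_f):", shoot(m0))
```

Because this example is linear, the secant update converges almost immediately; for nonlinear processes many more trials may be needed, which is the shortcoming noted above.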

THE  MAXIMUM  PRINCIPLE 

1.   Outline 

The maximum principle of Pontryagin is generally regarded as
most promising for the solution of complex problems. The Russian
mathematician Pontryagin discovered that the control problem can
proceed from the calculus of variations by relating the
Pontryagin function and the Hamiltonian function. In fact,
Pontryagin's maximum principle bears a close relation to the
classical problem of Mayer. It differs, however, from the Mayer
problem in one respect. In the Mayer problem, every control
signal is unbounded. In Pontryagin's work, on the other hand,
the values of the control signal are bounded. A control vector
which satisfies the constraint conditions is referred to as an
admissible control vector.

As in the calculus of variations, Pontryagin's maximum
principle can be used on three basic problems.

1)  The Minimum-time Control Problem. The minimum-time control
problem may be stated as the determination of an admissible
control vector m so that the process is taken from a specified
initial state vector $x_0$ to a desired final state vector $x_f$ in
the shortest possible time.

2)  The Terminal-control Problem. The terminal-control
problem is the determination of an admissible control vector m
such that, in a given time interval T, the system is taken from
an initial state $x_0$ into a state in which one or a combination
of the state variables becomes as large as possible or as small
as possible, and the remaining state variables have fixed values
within physical limits.

3)  The Minimum-integral Control Problem. The minimum-integral
control problem may be stated as the determination of an
admissible control vector m in such a manner that the integral

$$ I = \int_{t_0}^{t_f} F(x, m, t)\,dt $$

is reduced to a minimum during the time of movement $t_f - t_0$.

These three modes of optimum control can be transformed to
an optimization with respect to co-ordinates or state variables,
which is referred to as the "generalized mode of optimum control".
The transformation is carried out by "invariant embedding", a
procedure of increasing the dimensionality of the state vector
by adding a new co-ordinate. The reduction of the three modes
of optimum control to the generalized mode is presented as follows.

4)  Reducing a Minimum-time Control Problem to the Generalized
Mode of Optimal Control. Consider the nth-order control
process characterized by

$$ \dot{x}_i = f_i(x, m, t), \qquad i = 1, 2, \ldots, n \tag{2-1} $$

This problem implies the minimization of the time required
to move the process from an initial state to a desired final
state, i.e.,

$$ \min \int_{t_0}^{t_f} d\tau \tag{2-2} $$

Introduce a new state variable $x_{n+1}(t)$ such that

$$ \dot{x}_{n+1}(t) = 1 \tag{2-3} $$

$$ x_{n+1}(t) = \int_{t_0}^{t} d\tau \tag{2-4} $$

The optimal control problem then reduces to the determination of
an admissible control vector m so that the new state variable
$x_{n+1}$ is minimized.

5)  Reducing a Terminal-control Problem to the Generalized
Mode of Optimal Control. Consider the nth-order control process
characterized by equation (2-1). Now introduce a new state
variable

$$ x_{n+1}(t) = F\big[x_1(t), x_2(t), \ldots, x_n(t)\big] \tag{2-5} $$

with the initial value given by

$$ x_{n+1}(0) = F\big[x_1(0), x_2(0), \ldots, x_n(0)\big] \tag{2-6} $$

In vector notation

$$ x_{n+1}(t) = F\big(x(t)\big) \tag{2-7} $$

$$ x_{n+1}(0) = F\big(x(0)\big) \tag{2-8} $$

$$ \dot{x}_{n+1} = \sum_{k=1}^{n} \frac{\partial F(x)}{\partial x_k}\,\dot{x}_k \tag{2-9} $$

Derivatives of the other state variables are given by
equation (2-1). The terminal-control problem is now reduced to
the problem of optimization with respect to the new co-ordinate
$x_{n+1}$ at the final moment of time.

6)  Reducing a Minimum-integral Control Problem to the
Generalized Mode of Optimal Control. Consider the nth-order control
process characterized by equation (2-1). Introduce a new
state variable $x_{n+1}(t)$ defined by

$$ x_{n+1}(t) = \int_{t_0}^{t} F(x, m, t)\,dt \tag{2-10} $$

$$ x_{n+1}(t_0) = 0 \tag{2-11} $$

$$ \dot{x}_{n+1}(t) = F(x, m, t) \tag{2-12} $$

Now the problem becomes that of minimizing the (n+1)th
co-ordinate, $x_{n+1}(t)$, at the terminal point of the trajectory,
$t = t_f$.
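
The embedding of item 6 can be illustrated numerically. The sketch below assumes a scalar process ẋ = -x + m with the control held at m = 0, x(0) = 1, t_f = 1 and integrand F = x² + m²; these choices, the function names, and the Runge-Kutta integrator are assumptions made only for the example. The extra state is integrated together with x as in (2-11) and (2-12), and its terminal value is compared with the known value of the integral.

```python
import math

def augmented_rhs(state):
    """Right-hand side of the augmented system (x, x_aug).

    x obeys the assumed process xdot = -x + m with m = 0;
    x_aug obeys (2-12): x_aug_dot = F(x, m) = x^2 + m^2.
    """
    x, _x_aug = state
    m = 0.0
    return (-x + m, x * x + m * m)

def integrate(t_final=1.0, n=1000):
    """Integrate the augmented system with classical RK4 from t = 0."""
    h = t_final / n
    state = (1.0, 0.0)  # x(0) = 1 and, by (2-11), x_aug(0) = 0
    for _ in range(n):
        k1 = augmented_rhs(state)
        k2 = augmented_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = augmented_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = augmented_rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state

x_final, x_aug_final = integrate()
# With m = 0, x(t) = exp(-t), so the integral of x^2 over [0, 1] is known.
exact_integral = (1.0 - math.exp(-2.0)) / 2.0
print("terminal value of the added co-ordinate:", x_aug_final)
print("value of the integral criterion:", exact_integral)
```

The terminal value of the added co-ordinate reproduces the integral criterion, so minimizing one is equivalent to minimizing the other.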

The optimal control problem discussed above may be considered
as a special case of the more general problem of maximizing or
minimizing the Pontryagin function

$$ \mathcal{P} = \sum_{i=1}^{n} b_i\,x_i(t_f) \tag{2-13} $$

This can be written in vector form as

$$ \mathcal{P} = \big(b, x(t_f)\big) = b'x(t_f) \tag{2-14} $$

where x is the state vector of the nth-order control process under
consideration, and b is a column vector which depends upon the
co-ordinates to be minimized or maximized. A simple geometrical
interpretation of the maximum principle is that the control
vector m is chosen in such a way that the state vector $x(t_f)$
moves "farthest" in the direction of "-b", and thus the
Pontryagin function $\mathcal{P}$ takes on a minimum value. In
optimal control problems the final state of the optimal trajectory
may be either free or constrained. Equation (2-14) applies to the
unconstrained case. If the final state of the process is
constrained by

$$ R_k\big[x(t_f)\big] = 0 \tag{2-15} $$

the Pontryagin function takes the form

$$ \mathcal{P} = b'x(t_f) + u'R\big[x(t_f)\big] \tag{2-16} $$

where u is a vector Lagrange multiplier.

Frequently, the extremization of the Pontryagin function is not
easy to accomplish directly. Pontryagin first recognized the
simplicity of the Hamiltonian function, and its very nature makes
it tempting to think that combining the extremization of the
Pontryagin function with the use of the Hamiltonian may lead to
simple and elegant methods for solving optimization problems.

Now, the maximum (or minimum) principle states that, if the
control vector m is optimum, i.e., if it minimizes (or maximizes)
the Pontryagin function $\mathcal{P}$, then the Hamiltonian
H(x, p, m, t) is maximized (or minimized) with respect to m over
the control interval. The Hamiltonian is defined as

$$ H(x, p, m, t) = \sum_{j=1}^{n} p_j\,f_j \tag{2-17} $$

where x is the state vector and p is the momentum vector, defined
as the solution to the differential equation

$$ \dot{p}_i = - \sum_{j=1}^{n} p_j\,\frac{\partial f_j}{\partial x_i} \tag{2-18} $$

where

$$ p_i(t_f) = -b_i, \qquad i = 1, 2, \ldots, n \tag{2-19} $$

$b_i$ being some known constant specified in the Pontryagin
function in equation (2-13).

If we differentiate equation (2-17) with respect to $p_i$,

$$ \frac{\partial H}{\partial p_i} = f_i \tag{2-20} $$

Differentiating equation (2-17) with respect to $x_i$ gives

$$ \frac{\partial H}{\partial x_i} = \sum_{j=1}^{n} p_j\,\frac{\partial f_j}{\partial x_i}, \qquad i = 1, 2, \ldots, n \tag{2-21} $$

Comparing equations (2-1), (2-18) with equations (2-20), (2-21),
we obtain the Hamiltonian canonical form

$$ \dot{x}_i = \frac{\partial H}{\partial p_i} \tag{2-22} $$

$$ \dot{p}_i = - \frac{\partial H}{\partial x_i} \tag{2-23} $$

These canonical equations are subject to the boundary conditions
on $x_i(t_0)$ and $p_i(t_f)$; that is,

$$ x_i(t_0) = x_i^0 \tag{2-24} $$

and

$$ p_i(t_f) = -b_i, \qquad i = 1, 2, \ldots, n \tag{2-25} $$
The physical interpretation of the maximum principle may be
stated as follows: the Hamiltonian H is the inner product of p
and f, or that of p and $\dot{x}$, which represents the power when
p is identified as the momentum. Thus to minimize $\mathcal{P}$,
the power is maximized, and when $\mathcal{P}$ is a minimum, H is
a maximum.
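
The maximization of H over the admissible controls can be sketched for a simple case. The example below assumes a second-order process ẋ₁ = x₂, ẋ₂ = m with the bounded control |m| ≤ 1, so that by (2-17) H = p₁x₂ + p₂m. The process, the grid search, and the function names are assumptions made for illustration. Since H is linear in m, its maximum over the admissible set lies at a bound and is determined by the sign of p₂.

```python
def hamiltonian(p, x, m):
    """H = p1*f1 + p2*f2 for the assumed process f1 = x2, f2 = m, cf. (2-17)."""
    p1, p2 = p
    _x1, x2 = x
    return p1 * x2 + p2 * m

def best_control(p, x, steps=200):
    """Search a grid of admissible controls in [-1, 1] for the one
    that maximizes the Hamiltonian at the given p and x."""
    grid = [-1.0 + 2.0 * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda m: hamiltonian(p, x, m))

# Two sample points: negative and positive second momentum component.
m_negative_p2 = best_control(p=(0.3, -2.0), x=(1.0, 0.5))
m_positive_p2 = best_control(p=(0.3, 0.7), x=(1.0, 0.5))
print(m_negative_p2, m_positive_p2)
```

In this assumed case the maximizing control sits on the boundary of the admissible set, which is the typical outcome when the control enters the process equations linearly.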

2.   Proof of the Maximum Principle

The proof is initiated with the determination of the variation of
the Pontryagin function, $\delta\mathcal{P}$, due to a variation of
the state variables, $\delta x_i$, and a change in the control
signal, $\delta m$. Assume that the nth-order control process is
characterized by

$$ \dot{x} = f(x, m, t) \tag{2-26} $$

where x is the n-dimensional state vector and m is the
r-dimensional control vector. The variation $\delta\mathcal{P}$ as
a function of $\delta m$ and $\delta x_i$ is first derived. To
begin with, the following summation is formed:

$$ \sum_{i=1}^{n} p_i\,\delta x_i \tag{2-27} $$

Taking the derivative of this summation with respect to t gives

$$ \frac{d}{dt} \sum_{i=1}^{n} p_i\,\delta x_i = \sum_{i=1}^{n} \dot{p}_i\,\delta x_i + \sum_{i=1}^{n} p_i\,\delta\dot{x}_i \tag{2-28} $$

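Integrating both sides of equation (2-28) from $t_0$ to $t_f$ and
simplifying leads to

$$ \sum_{i=1}^{n} p_i\,\delta x_i\,\bigg|_{t_0}^{t_f} = \int_{t_0}^{t_f} \sum_i p_i\,\big[ f_i(x + \delta x,\ m + \delta m,\ t) - f_i(x, m, t) \big]\,dt + \int_{t_0}^{t_f} \sum_i \dot{p}_i\,\delta x_i\,dt \tag{2-29} $$

Since $p_i(t_f) = -b_i$ and $\delta x_i(t_0) = 0$, the left side of
equation (2-29) is

$$ \sum_{i} p_i\,\delta x_i\,\bigg|_{t_0}^{t_f} = - \sum_i b_i\,\delta x_i(t_f) = -\delta\mathcal{P} \tag{2-30} $$

Thus the variation of the Pontryagin function $\mathcal{P}$ due to
a change in $x_i$ and m is

$$ \delta\mathcal{P} = - \int_{t_0}^{t_f} \sum_i p_i\,\big[ f_i(x + \delta x,\ m + \delta m,\ t) - f_i(x, m, t) \big]\,dt - \int_{t_0}^{t_f} \sum_i \dot{p}_i\,\delta x_i\,dt \tag{2-31} $$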

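A Taylor series expansion of the integrand with respect to x is
carried out. Omitting the higher-order terms and using the
relationships of equations (2-18), (2-23), and (2-31), one obtains

$$
\begin{aligned}
\delta\mathcal{P} = {} & \int_{t_0}^{t_f} \sum_i \sum_j p_i\,\frac{\partial f_i(x, m, t)}{\partial x_j}\,\delta x_j\,dt \\
& - \int_{t_0}^{t_f} \sum_i p_i\,\Big[ f_i(x,\ m + \delta m,\ t) - f_i(x, m, t) + \sum_j \frac{\partial f_i(x,\ m + \delta m,\ t)}{\partial x_j}\,\delta x_j \\
& \qquad + \frac{1}{2} \sum_j \sum_k \frac{\partial^2 f_i(x + \theta\,\delta x,\ m + \delta m,\ t)}{\partial x_j\,\partial x_k}\,\delta x_j\,\delta x_k \Big]\,dt
\end{aligned}
\tag{2-32}
$$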

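Rearrangement yields

$$
\begin{aligned}
\delta\mathcal{P} = {} & - \int_{t_0}^{t_f} \sum_i p_i\,\big[ f_i(x,\ m + \delta m,\ t) - f_i(x, m, t) \big]\,dt \\
& - \int_{t_0}^{t_f} \sum_i \sum_j p_i\,\Big[ \frac{\partial f_i(x,\ m + \delta m,\ t)}{\partial x_j} - \frac{\partial f_i(x, m, t)}{\partial x_j} \Big]\,\delta x_j\,dt \\
& - \frac{1}{2} \int_{t_0}^{t_f} \sum_i \sum_j \sum_k p_i\,\frac{\partial^2 f_i(x + \theta\,\delta x,\ m + \delta m,\ t)}{\partial x_j\,\partial x_k}\,\delta x_j\,\delta x_k\,dt
\end{aligned}
\tag{2-33}
$$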

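In view of equation (2-17), and defining R as

$$
\begin{aligned}
R = {} & \int_{t_0}^{t_f} \sum_i \sum_j p_i\,\Big[ \frac{\partial f_i(x,\ m + \delta m,\ t)}{\partial x_j} - \frac{\partial f_i(x, m, t)}{\partial x_j} \Big]\,\delta x_j\,dt \\
& + \frac{1}{2} \int_{t_0}^{t_f} \sum_i \sum_j \sum_k p_i\,\frac{\partial^2 f_i(x + \theta\,\delta x,\ m + \delta m,\ t)}{\partial x_j\,\partial x_k}\,\delta x_j\,\delta x_k\,dt
\end{aligned}
\tag{2-34}
$$

$\delta\mathcal{P}$ may be expressed as

$$ \delta\mathcal{P} = - \int_{t_0}^{t_f} \big[ H(x, p,\ m + \delta m,\ t) - H(x, p, m, t) \big]\,dt - R \tag{2-35} $$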

First, we want to show the necessary condition, which states
that if the Hamiltonian H is not a maximum, the minimum
condition for $\mathcal{P}$ is violated. The condition for the
Pontryagin function $\mathcal{P}$ to be a minimum for any small
change $\delta m$ of the control vector m is

$$ \delta\mathcal{P} \ge 0 \tag{2-36} $$

Now assume that the maximum condition for H is not satisfied
during a small interval $(t_a, t_b)$ which lies within the
interval $(t_0, t_f)$. Then, for some small variation $\delta m$ of
the control vector m,

$$ H(x, p,\ m + \delta m,\ t) - H(x, p, m, t) > \epsilon \tag{2-37} $$

where t lies within the interval $(t_a, t_b)$ and $\epsilon$ is a
positive constant. A control vector m having the following
properties is chosen. During the interval $(t_a, t_b)$, m may be
varied by a very small amount $\delta m$, and outside this interval
m remains unchanged. Thus equation (2-35) becomes

$$ \delta\mathcal{P} = - \int_{t_a}^{t_b} \big[ H(x, p,\ m + \delta m,\ t) - H(x, p, m, t) \big]\,dt - R = - \int_{t_a}^{t_b} \sum_s \frac{\partial H}{\partial m_s}\,\delta m_s\,dt - R \tag{2-38} $$
Since both $\delta m$ and $\delta x$ are very small, the second
term on the right-hand side of equation (2-34) is an infinitesimal
of higher order, which may be neglected, and then

$$ R = \int_{t_0}^{t_f} \sum_i \sum_j p_i\,\frac{\partial \big[ f_i(x,\ m + \delta m,\ t) - f_i(x, m, t) \big]}{\partial x_j}\,\delta x_j\,dt \tag{2-39} $$

A Taylor series expansion of $f_i(x,\ m + \delta m,\ t)$ with
respect to m results in

$$ f_i(x,\ m + \delta m,\ t) = f_i(x, m, t) + \sum_s \frac{\partial f_i(x, m, t)}{\partial m_s}\,\delta m_s + \cdots \tag{2-40} $$

Since $\delta m$ is very small, we can omit the higher-order terms,
and then

$$ R = \int_{t_0}^{t_f} \sum_i \sum_j \sum_s p_i\,\frac{\partial^2 f_i(x, m, t)}{\partial x_j\,\partial m_s}\,\delta x_j\,\delta m_s\,dt \tag{2-41} $$

In view of equation (2-17), one obtains

$$ R = \int_{t_a}^{t_b} \sum_j \sum_s \frac{\partial^2 H}{\partial x_j\,\partial m_s}\,\delta x_j\,\delta m_s\,dt \tag{2-42} $$

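Combining equation (2-38) and equation (2-42) yields

$$ \delta\mathcal{P} = -\int_{t_a}^{t_b}\left(\sum_s \frac{\partial H}{\partial m_s}\,\delta m_s + \sum_j\sum_s \frac{\partial^2 H}{\partial x_j\,\partial m_s}\,\delta x_j\,\delta m_s\right)dt \tag{2-43} $$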

which is less than zero, since the first term of the integrand is
positive and the second term is smaller in magnitude than the
first. This implies that for this particular control vector, the
Pontryagin function $\mathcal{P}$ is not a minimum with respect to
variations $\delta m$ of the control vector m.

In short, the above result shows that if the maximum condition
for H is not satisfied, the minimum condition for $\mathcal{P}$ is
violated. This proves the necessary condition.

For the proof of the sufficient condition, let the process
dynamics be characterized by

$$ \dot{x}_i(t) = \sum_{k=1}^{n} a_{ik}(t)\,x_k(t) + u_i(m) \tag{2-44} $$

Then the Hamiltonian of the system may be expressed in the form

$$ H = \sum_i\sum_k a_{ik}(t)\,x_k(t)\,p_i(t) + \sum_i u_i(m)\,p_i(t) \tag{2-45} $$

Since the first term of equation (2-45) is linear in $x_k$ and is
independent of m, and the second term is independent of x,

$$ \frac{\partial^2 H}{\partial x_j\,\partial m_s} = 0 \tag{2-46} $$


So R = 0 and equation (2-35) becomes

$$ \delta\mathcal{P} = -\int_{t_0}^{t_f}\left[H(x,p,m+\delta m,t) - H(x,p,m,t)\right]dt \tag{2-47} $$

Hence if the maximum condition holds for the Hamiltonian H, the
integrand of equation (2-47) is nonpositive and $\delta\mathcal{P}$
is nonnegative; that is, the minimum condition for the Pontryagin
function $\mathcal{P}$ is fulfilled. This proves the sufficiency
condition.

3. Application of the Maximum Principle

Example. Consider the equation $\ddot{x} = m$, where m is a real
control parameter constrained by the condition that $|m| \le 1$.
In the phase co-ordinates

$$ \dot{x}_1 = x_2 \tag{2-48} $$

$$ \dot{x}_2 = m \tag{2-49} $$

The problem is, "How can we get to the origin (0, 0) from
an initial state $x_0$ in the shortest time?"

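First we write the Hamiltonian function

$$ H = \sum_{i=1}^{2} p_i f_i = p_1 f_1 + p_2 f_2 = p_1 x_2 + p_2 m \tag{2-50} $$

$$ \dot{p}_1 = -\frac{\partial H}{\partial x_1} = 0 \tag{2-51} $$

$$ p_1 = c_1 \tag{2-52} $$

where $c_1$ is a constant,

$$ \dot{p}_2 = -\frac{\partial H}{\partial x_2} = -p_1 \tag{2-53} $$

$$ p_2 = c_2 - c_1 t \tag{2-54} $$

where $c_2$ is a constant.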
Taking the condition $-1 \le m \le 1$ into account,
$m(t) = \mathrm{sign}\,p_2(t) = \mathrm{sign}(c_2 - c_1 t)$. Since
$c_2 - c_1 t$ is a linear function which changes sign at most once
on the interval $t_0 \le t \le t_1$, the optimal control m(t) on
every such interval is a piecewise constant function which takes
on the values $\pm 1$ and has at most two intervals on which it is
constant.

For the case m = +1, equation (2-49) gives

$$ x_2 = mt + k = t + k \tag{2-55} $$

where k is a constant. From equation (2-48),

$$ x_1 = \int x_2\,dt = \frac{t^2}{2} + kt + l = \frac{1}{2}(t+k)^2 + \left(l - \frac{k^2}{2}\right) = \frac{1}{2}x_2^2 + R \tag{2-56} $$

where k, l, R are constants.
Thus the portion of the phase trajectory for which m = +1
is an arc of one of the family of parabolas shown in Fig. 2-2a.


Fig.  (2-2a). 


Fig.  (2-2b). 


For the case m = -1

$$ x_2 = -t + k' \tag{2-57} $$

$$ x_1 = -\frac{t^2}{2} + k't + l' = -\frac{1}{2}(-t+k')^2 + R' = -\frac{1}{2}x_2^2 + R' \tag{2-58} $$

The family of parabolas of equation (2-58) is shown in
Fig. 2-2b. The phase points move upwards along the parabolas
of equation (2-56), since $dx_2/dt = m = +1$, and downwards along
the parabolas of equation (2-58), since $dx_2/dt = m = -1$.

If m is initially equal to +1 and then switches to -1, the phase
trajectory consists of two adjoining parabolic segments
(Fig. 2-3a). If m = -1 first and m = +1 afterwards, the phase
curve is as shown in Fig. 2-3b.


Fig. 2-3a.


Fig. 2-3b.


Fig. 2-4.

If we combine Figs. 2-3a and 2-3b, we obtain Fig. 2-4.
Its physical meaning is that the phase point moves along an
arc of the parabola (equation 2-58) which passes through the
initial point $x_0$ if $x_0$ is above the curve AOB, and along an
arc of a parabola (equation 2-56) if $x_0$ is below this curve. In
other words, if the initial position $x_0$ is above the curve AOB,
the phase point must move under the influence of the control
m = -1 until it reaches the arc AO. At the instant it arrives,
the value of m switches to +1 and remains at this value until the
phase point reaches the origin. However, if the initial position
$x_0$ is below AOB, m must equal +1 until the phase point reaches
the arc BO, and at that time the value of m changes to -1.
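
The switching rule above can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes that the switching curve AOB consists of the two parabolic arcs through the origin obtained from equations (2-56) and (2-58) with R = 0 and R' = 0, so that a point lies above AOB when $x_1 + \tfrac{1}{2}x_2|x_2| > 0$. The function names `propagate` and `time_optimal_plan` are illustrative only.

```python
import math

def propagate(x1, x2, m, t):
    """Exact solution of x1' = x2, x2' = m for a constant control m over time t."""
    return x1 + x2 * t + 0.5 * m * t * t, x2 + m * t

def time_optimal_plan(x1, x2):
    """Return (first control, switching time, total time) for reaching the origin."""
    # s > 0: the initial point is above the switching curve AOB; s < 0: below it.
    s = x1 + 0.5 * x2 * abs(x2)
    if s > 0 or (s == 0 and x2 > 0):
        m = -1.0
    else:
        m = 1.0
    # Along the first arc x2^2/2 - m*x1 is constant (equations 2-56 and 2-58).
    c = max(0.5 * x2 * x2 - m * x1, 0.0)
    root = math.sqrt(c)
    t_switch = root - m * x2         # time spent on the first parabolic arc
    t_total = 2.0 * root - m * x2    # first arc plus the final arc into the origin
    return m, t_switch, t_total

# Usage: start at rest at x1 = 1, follow the plan, and print the final state.
m, t_switch, t_total = time_optimal_plan(1.0, 0.0)
x1, x2 = propagate(1.0, 0.0, m, t_switch)
x1, x2 = propagate(x1, x2, -m, t_total - t_switch)
print(m, t_switch, t_total, (x1, x2))
```

The plan uses at most one switch, in agreement with the statement that m(t) has at most two intervals of constancy. Propagating the state exactly along the two arcs brings it to the origin.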

DYNAMIC  PROGRAMMING 

1.   Outline 

Dynamic programming, developed by the American mathematician
Richard Bellman, is a simple but very powerful concept which
finds applications in the solution of multistage decision
problems. The basic idea is the principle of invariant embedding,
according to which a very difficult or unsolvable problem is
embedded into a class of simpler solvable problems, so that a
solution can be obtained. Numerous applications of dynamic
programming techniques are possible, but here we are interested in
its application to optimal control problems.

In general, multistage decision problems are best solved by
means of the functional equation approach, and the functional
equation describing a multistage decision process can readily
be derived by invoking "the principle of optimality", which
states: "An optimal policy has the property that whatever the
initial state and initial decision are, the remaining decisions
must constitute an optimal policy with regard to the state
resulting from the first decision."

The principle of optimality, which motivates the basic
properties of optimal control strategies, is based upon the
fundamental concept of invariant embedding. This concept implies
that to solve a specific optimum decision problem, the original
problem is embedded within a family of similar problems which
are easier to solve. For multistage decision processes, this
allows the original multistage optimization problem to be
replaced by the problem of solving a sequence of single-stage
decision processes, which are simpler to handle.

Let x be the state vector characterizing a physical system
at any time. Suppose the state of the physical system is
transferred from $x_1$ into $x_2$ by the transformation

$$ x_2 = g(x_1, m_1) \tag{3-1} $$

For a single-stage decision process, this yields an output or
return

$$ R_1 = r(x_1, m_1) \tag{3-2} $$

The problem is to choose a decision $m_1$ so as to maximize the
return. The maximum return is given by

$$ f_1(x_1) = \max_{m_1} r(x_1, m_1) \tag{3-3} $$

In a two-stage decision process, the state of the physical
system is first transformed from $x_1$ into $x_2$ by equation (3-1)
and is then transformed from $x_2$ into $x_3$ by the transformation

$$ x_3 = g(x_2, m_2) \tag{3-4} $$

This sequence of operations results in a total return

$$ R_2 = r(x_1, m_1) + r(x_2, m_2) \tag{3-5} $$

Then the optimum design problem is to choose a sequence of
allowable decisions $m_1$ and $m_2$ so as to maximize the total return.
The maximum return is given by

$$ f_2(x_1) = \max_{m_1,\,m_2}\left[r(x_1, m_1) + r(x_2, m_2)\right] \tag{3-6} $$

In general, for an N-stage decision process, the problem
is to choose an N-stage policy

$$ \{m_1, m_2, m_3, \ldots, m_N\} $$

so as to maximize the total return

$$ R_N = \sum_{j=1}^{N} r(x_j, m_j) \tag{3-7} $$

The maximum return of the N-stage process is given by

$$ f_N(x_1) = \max_{\{m_j\}}\left\{\sum_{j=1}^{N} r(x_j, m_j)\right\} \tag{3-8} $$

It is not practical to obtain the solution from the N simultaneous
equations that result from setting to zero the partial derivatives
of the quantity in the braces with respect to $m_j$, j = 1, 2, ..., N,
but the problem can be solved by using the principle of optimality.
If successive states are related by the transformation g, as in
equations (3-1) and (3-4), then the maximum return is given by

$$ f_N(x_1) = \max_{m_1}\left\{r(x_1, m_1) + f_{N-1}\left[g(x_1, m_1)\right]\right\} \tag{3-10} $$
Clearly, by applying the principle of optimality, the N-stage
decision process is reduced to a sequence of N single-stage
decision processes, thus enabling this optimization problem to be
solved in a systematic, iterative manner.
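
The recurrence (3-10) can be illustrated on a small discrete problem. The sketch below is only an example under stated assumptions: the state set, the decision set, the transformation `g`, and the return `r` are hypothetical choices made for illustration and do not come from the text. The recurrence is compared with a direct maximization of the total return (3-7) over all N-stage policies.

```python
from itertools import product

STATES = range(5)         # hypothetical discrete states 0..4
DECISIONS = (-1, 0, 1)    # hypothetical allowable decisions

def g(x, m):
    """Hypothetical transformation to the next state, kept inside STATES."""
    return min(max(x + m, 0), 4)

def r(x, m):
    """Hypothetical single-stage return: high states pay, decisions cost effort."""
    return x - 0.5 * abs(m)

def max_return(n_stages):
    """f_N(x) for every state x via f_N(x) = max_m [r(x, m) + f_{N-1}(g(x, m))]."""
    f = {x: 0.0 for x in STATES}      # f_0 = 0: no stages remain
    for _ in range(n_stages):
        f = {x: max(r(x, m) + f[g(x, m)] for m in DECISIONS) for x in STATES}
    return f

def brute_force(x1, n_stages):
    """Maximize the total return (3-7) directly over all N-stage policies."""
    best = float("-inf")
    for policy in product(DECISIONS, repeat=n_stages):
        x, total = x1, 0.0
        for m in policy:
            total += r(x, m)
            x = g(x, m)
        best = max(best, total)
    return best

print(max_return(4)[0], brute_force(0, 4))
```

The recurrence performs N sweeps over the states, each a single-stage maximization, whereas the direct search examines every one of the $3^N$ policies. Both give the same maximum return.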

We should pay attention to the fact that any multistage
decision process in realistic situations is usually time
dependent and stochastic. In other words, it is unreasonable to
assume a multistage decision process to be independent of time
or to assume that the process is deterministic in nature. We
shall consider these realistic conditions in the following
problems.

2.   Basic  Principle  and  Application 

We will now discuss the application of the dynamic programming
technique to the three basic types of problems: minimum-integral
control processes, terminal-control processes, and minimum-time
processes.

1) Minimum-integral Control Processes. Consider an nth-order
control process characterized by the vector differential
equation

$$ \dot{x}(t) = g(x, m, t) \tag{3-11} $$

where x = an n-vector representing the state of the process
m = an r-vector denoting the control signals
g = a differentiable vector function of the arguments
x, m, t.
The initial conditions are given by

$$ x(t_0) = x_0 \tag{3-12} $$

Determine the optimal control vector m which minimizes the
integral criterion function

$$ I(m) = \int_{t_0}^{t_f} F(x, m, t)\,dt \tag{3-13} $$

The integrand F(x, m, t) is a differentiable scalar function of
the state vector, the control vector, and time.


Fig. 3-1. Optimum trajectory.

In order to apply the functional equation technique of
dynamic programming, this optimization problem is embedded within
the wider problem of minimizing

$$ \int_{t}^{t_f} F(x, m, \tau)\,d\tau $$

We write

$$ f(x, t) = \min_{m}\int_{t}^{t_f} F(x, m, \tau)\,d\tau \tag{3-14} $$

where t ranges over the interval $(t_0, t_f)$, the minimum is taken
over all m, and $f(x, t) = f(x_0, t_0)$ at $t = t_0$. Application of the
principle of optimality reduces equation (3-14) to the functional
equation

$$ f(x, t) = \min_{m}\left\{\int_{t}^{t+\Delta} F(x, m, \tau)\,d\tau + f(x + \dot{x}\Delta,\ t + \Delta)\right\} \tag{3-15} $$

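By integration and Taylor series expansion one obtains

$$ f(x,t) = \min_{m}\left\{F(x,m,t)\Delta + f(x,t) + \frac{\partial f}{\partial x}\,\dot{x}\,\Delta + \frac{\partial f}{\partial t}\,\Delta + \epsilon(\Delta)\right\} \tag{3-16} $$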

If $\Delta$ approaches zero, equation (3-16) becomes

$$ -\frac{\partial f}{\partial t} = \min_{m}\left\{F(x, m, t) + \frac{\partial f}{\partial x}\,g(x, m, t)\right\} \tag{3-17} $$

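From equation (3-17) the following two equations result:

$$ \frac{\partial F}{\partial m} + \frac{\partial f}{\partial x}\,\frac{\partial g}{\partial m} = 0 \tag{3-18} $$

$$ F + \frac{\partial f}{\partial x}\,g(x, m, t) + \frac{\partial f}{\partial t} = 0 \tag{3-19} $$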

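Solving for $\partial f/\partial x$ and $\partial f/\partial t$ from equation (3-18) and
equation (3-19), respectively, yields

$$ \frac{\partial f}{\partial x} = -\frac{\partial F/\partial m}{\partial g/\partial m} = P(x, m, t) \tag{3-20} $$

$$ \frac{\partial f}{\partial t} = -F(x, m, t) - \frac{\partial f}{\partial x}\,g(x, m, t) = -F(x, m, t) + \frac{\partial F/\partial m}{\partial g/\partial m}\,g(x, m, t) = Q(x, m, t) \tag{3-21} $$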

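By partial differentiation of equation (3-20) with respect to t
and partial differentiation of equation (3-21) with respect to
x, we obtain

$$ \frac{\partial^2 f}{\partial x\,\partial t} = \frac{\partial P}{\partial t} + \frac{\partial P}{\partial m}\,\frac{\partial m}{\partial t} + \frac{\partial P}{\partial x}\,\frac{\partial x}{\partial t} \tag{3-22} $$

$$ \frac{\partial^2 f}{\partial x\,\partial t} = \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial m}\,\frac{\partial m}{\partial x} \tag{3-23} $$

Combining equation (3-22) and equation (3-23),

$$ \frac{\partial P}{\partial t} + \frac{\partial P}{\partial m}\,\frac{\partial m}{\partial t} + \frac{\partial P}{\partial x}\,\frac{\partial x}{\partial t} = \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial m}\,\frac{\partial m}{\partial x} \tag{3-24} $$

From equation (3-24), the optimum control m can be determined.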

In the foregoing study, we assumed that no constraint is
imposed upon control signals and state variables. In practice,
because of the physical limitations of controlling devices,
physical constraints on control signals or state variables must
be taken into account in optimum control problems. If the
constraints are not considered, the design may lead to a
system demanding excessively large control signals, which is
unrealistic and impracticable. Commonly encountered constraints
on control signals may be described as

a) integral constraint

$$ \int_{t_0}^{t_f} H(m)\,dt \le c \tag{3-25} $$

b) amplitude saturation

$$ a_i \le m_i(t) \le b_i, \qquad t_0 \le t \le t_f \tag{3-26} $$

where $a_i$, $b_i$ = constants
c = a constant vector
H(m) = a vector function of the control signals.
a) The first kind of constraint can be handled by the
Lagrange multiplier method. (This was mentioned in part 2,
calculus of variations.) The optimization problem is transformed
into the problem of minimizing the synthetic function

$$ I_1(m) = I(m) + \lambda'\int_{t_0}^{t_f} H(m)\,dt \tag{3-27} $$

where I(m) is the specified performance index to be minimized and
$\lambda'$ is the transpose of the vector Lagrange multiplier $\lambda$.
After the minimization of the synthetic function is achieved, the
desired control vector results as a function of the vector
Lagrange multiplier $\lambda$. Substitution of the vector $m(\lambda)$
into equation (3-25) leads to r equations which can be solved for
the elements of the vector Lagrange multiplier.

b) When constraints of the second type are considered, the
analytical solutions of the partial differential equations and the
Euler-Lagrange equation which appear in the optimum problems
are extremely difficult to obtain. To circumvent these
difficulties, the following functional equations will be used:

$$ f(x, t) = \min_{m}\left[\int_{t}^{t+\Delta} F(x, m, \tau)\,d\tau + f(x + \dot{x}\Delta,\ t + \Delta)\right] = \min_{m}\left[F(x, m, t)\Delta + f(x + g\Delta,\ t + \Delta)\right] \tag{3-28} $$

where the control signal satisfies the constraint $|m| \le M$, and
$\Delta$ is a predetermined small interval of time. An N-stage
computational process yields

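$$ f_{N-j}(x, t) = \min_{m}\left\{F(x, m, t)\Delta + f_{N-(j+1)}(x + g\Delta,\ t + \Delta)\right\} \tag{3-29} $$

with $t = t_0 + j\Delta$, j = 0, 1, 2, ..., N, and $t_f = t_0 + N\Delta$.

For j = 0,

$$ f_N(x, t_0) = \min_{m}\left[F(x, m, t_0)\Delta + f_{N-1}(x + g\Delta,\ t_0 + \Delta)\right] \tag{3-30} $$

For j = N,

$$ f_0(x, t_f) = 0 \tag{3-31} $$

For j = N-1,

$$ f_1(x, t_f - \Delta) = \min_{m}\left\{F(x, m, t_f - \Delta)\Delta + f_0(x + g\Delta,\ t_f)\right\} = \min_{m} F(x, m, t_f - \Delta)\Delta \tag{3-32} $$

and

$$ f_2(x, t_f - 2\Delta) = \min_{m}\left\{F(x, m, t_f - 2\Delta)\Delta + f_1(x + g\Delta,\ t_f - \Delta)\right\} \tag{3-33} $$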

From equation (3-32) and equation (3-33), the values of
$f_1(x, t_f - \Delta)$ and the corresponding $m_1$, and of
$f_2(x, t_f - 2\Delta)$ and the corresponding $m_2$, are determined
for successive values of x. The values of $f_1$ needed at the
displaced states in equation (3-33) may be obtained from the
tabulated values either directly or by interpolation or
extrapolation. By this method, the optimum control signal m which
minimizes the specified integral criterion function can be
ascertained.
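
The backward computation of equations (3-29) to (3-33) can be sketched as follows. This is an illustrative sketch only. It assumes a scalar state, a tabulation of f on a uniform grid, linear interpolation between grid points, and clamping at the ends of the grid in place of extrapolation. The example process $\dot{x} = m$ with $F = x^2 + m^2$ and $|m| \le 1$, and the names `interp` and `backward_sweep`, are hypothetical.

```python
def interp(grid, values, x):
    """Linear interpolation of tabulated values; x is clamped to the grid ends."""
    x = min(max(x, grid[0]), grid[-1])
    h = (grid[-1] - grid[0]) / (len(grid) - 1)
    i = min(int((x - grid[0]) / h), len(grid) - 2)
    w = (x - grid[i]) / h
    return (1.0 - w) * values[i] + w * values[i + 1]

def backward_sweep(F, g, grid, controls, t_f, delta, n_stages):
    """Tabulate f_1, ..., f_N of (3-29) together with the minimizing controls."""
    f = [0.0] * len(grid)                  # f_0(x, t_f) = 0, equation (3-31)
    tables = []
    for k in range(1, n_stages + 1):
        t = t_f - k * delta                # stage k starts k intervals before t_f
        new_f, policy = [], []
        for x in grid:
            cost, m_best = min(
                (F(x, m, t) * delta + interp(grid, f, x + g(x, m, t) * delta), m)
                for m in controls
            )
            new_f.append(cost)
            policy.append(m_best)
        f = new_f
        tables.append((f, policy))
    return tables

# Hypothetical example: dx/dt = m, F = x^2 + m^2, |m| <= 1, t_f = 1, N = 10.
grid = [(i - 20) / 10 for i in range(41)]           # x from -2 to 2
controls = [-1.0 + 0.25 * i for i in range(9)]      # admissible values of m
tables = backward_sweep(lambda x, m, t: x * x + m * m,
                        lambda x, m, t: m,
                        grid, controls, 1.0, 0.1, 10)
f_N, policy_N = tables[-1]
print(f_N[30], policy_N[30])                        # cost and control at x = 1
```

Because the admissible controls are enumerated, the amplitude constraint restricts the search at each stage instead of complicating it.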

2) Terminal-control Processes. Consider the nth-order control
process characterized by the differential equation

$$ \dot{x}(t) = A x(t) + D m(t) \tag{3-34} $$

with the initial conditions given by $x(0) = x_0$. In equation
(3-34) x is an n-vector representing the state of the control
process, and m is an r-vector representing the control signals,
subjected to the following constraints during the interval
$0 \le t \le T$:

$$ |m_k| \le M_k \tag{3-35} $$

$$ \int_{0}^{T} H(m)\,dt \le c \tag{3-36} $$


A and D are the coefficient and the driving matrices,
respectively. The optimum control vector m(t) is to be determined so
as to minimize the criterion function

$$ I(m) = G\left[x_1(T), x_2(T), \ldots, x_n(T)\right] \tag{3-37} $$

The solution to equation (3-34) is

$$ x(t) = z(t) + \int_{0}^{t} w(t - \tau)\,m(\tau)\,d\tau \tag{3-38} $$

where $z(t) = \Phi(t)\,x(0)$ = complementary solution,

$\int_{0}^{t} w(t - \tau)\,m(\tau)\,d\tau$ = particular solution, and

$w(t - \tau) = \Phi(t - \tau)D = e^{A(t-\tau)}D$.

In terms of vector components, the state variables of the
control process are given by

$$ x_i(t) = z_i(t) + \int_{0}^{t}\left[\sum_{k=1}^{r} w_{ik}(t - \tau)\,m_k(\tau)\right]d\tau, \qquad i = 1, 2, \ldots, n \tag{3-39} $$
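
The solution formula (3-38) can be checked numerically in the simplest case. The sketch below assumes a scalar process (n = r = 1), so that A and D reduce to hypothetical numbers `a` and `d` and $w(t-\tau) = e^{a(t-\tau)}d$. It evaluates the convolution integral by the midpoint rule and compares it with a direct Runge-Kutta integration of equation (3-34). The control signal $m(\tau) = \sin\tau$ is chosen only for illustration.

```python
import math

def convolution_solution(a, d, x0, m, t, n=2000):
    """x(t) = z(t) + integral from 0 to t of w(t - tau) m(tau) dtau, scalar case."""
    z = math.exp(a * t) * x0                   # complementary solution z(t)
    h = t / n
    particular = 0.0
    for i in range(n):                         # midpoint rule for the integral
        tau = (i + 0.5) * h
        particular += math.exp(a * (t - tau)) * d * m(tau) * h
    return z + particular

def rk4_solution(a, d, x0, m, t, n=2000):
    """Direct fourth-order Runge-Kutta integration of dx/dt = a x + d m(t)."""
    def rhs(s, x):
        return a * x + d * m(s)
    h, x, s = t / n, x0, 0.0
    for _ in range(n):
        k1 = rhs(s, x)
        k2 = rhs(s + h / 2, x + h * k1 / 2)
        k3 = rhs(s + h / 2, x + h * k2 / 2)
        k4 = rhs(s + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return x

print(convolution_solution(-0.5, 2.0, 1.0, math.sin, 2.0),
      rk4_solution(-0.5, 2.0, 1.0, math.sin, 2.0))
```

The two values agree to within the discretization error of the midpoint rule, which supports reading z(t) as the response to the initial state alone and the integral as the response to the control alone.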
The terminal control problem may now be restated as the
determination of the optimum control vector m(t) that minimizes the
function

$$ I(m) = G\left[z_1(T) + \int_{0}^{T}\sum_k w_{1k}(T - \tau)\,m_k(\tau)\,d\tau,\ \ldots,\ z_n(T) + \int_{0}^{T}\sum_k w_{nk}(T - \tau)\,m_k(\tau)\,d\tau\right] \tag{3-40} $$

subject to the constraints given in equation (3-35) and equation
(3-36).

Application of the Lagrange multiplier converts this
optimization problem into the minimization of the synthetic
function

$$ I_1(m) = G + \lambda\int_{0}^{T} H(m)\,d\tau \tag{3-41} $$

with respect to m, which is now constrained only by equation
(3-35). Let the minimum value of $I_1$ be denoted by
$f(z_1, z_2, \ldots, z_n, T)$. Then

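$$ f(z_1, z_2, \ldots, z_n, T) = \min_{m_k}\left\{G\left[z_1(T) + \int_{0}^{T}\sum_k w_{1k}(T - \tau)\,m_k(\tau)\,d\tau,\ \ldots,\ z_n(T) + \int_{0}^{T}\sum_k w_{nk}(T - \tau)\,m_k(\tau)\,d\tau\right] + \lambda\int_{0}^{T} H(m_1, m_2, \ldots, m_r)\,d\tau\right\} \tag{3-42} $$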
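which may be rewritten as

$$ f(z_1, z_2, \ldots, z_n, T) = \min_{m_k}\left\{G\left[z_1(T) + \int_{0}^{\Delta}\sum_k w_{1k}(T - \tau)\,m_k(\tau)\,d\tau + \int_{\Delta}^{T}\sum_k w_{1k}(T - \tau)\,m_k(\tau)\,d\tau,\ \ldots,\ z_n(T) + \int_{0}^{\Delta}\sum_k w_{nk}(T - \tau)\,m_k(\tau)\,d\tau + \int_{\Delta}^{T}\sum_k w_{nk}(T - \tau)\,m_k(\tau)\,d\tau\right] + \lambda\int_{0}^{\Delta} H(m_1, m_2, \ldots, m_r)\,d\tau + \lambda\int_{\Delta}^{T} H(m_1, m_2, \ldots, m_r)\,d\tau\right\} \tag{3-43} $$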

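where $\Delta$ is a very small time interval. By changes of limits,

$$ f(z_1, z_2, \ldots, z_n, T) = \min_{m_k}\left\{G\left[z_1(T) + \int_{0}^{\Delta}\sum_k w_{1k}(T - \tau)\,m_k(\tau)\,d\tau + \int_{0}^{T-\Delta}\sum_k w_{1k}(T - \tau - \Delta)\,m_k(\tau + \Delta)\,d\tau,\ \ldots,\ z_n(T) + \int_{0}^{\Delta}\sum_k w_{nk}(T - \tau)\,m_k(\tau)\,d\tau + \int_{0}^{T-\Delta}\sum_k w_{nk}(T - \tau - \Delta)\,m_k(\tau + \Delta)\,d\tau\right] + \lambda\int_{0}^{\Delta} H(m_1, \ldots, m_r)\,d\tau + \lambda\int_{0}^{T-\Delta} H\big(m_1(\tau + \Delta), \ldots, m_r(\tau + \Delta)\big)\,d\tau\right\} \tag{3-44} $$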

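By the principle of optimality, the functional equation
for the terminal control problem is found to be

$$ f(z_1, z_2, \ldots, z_n, T) = \min_{m_k}\left\{f\left[z_1 + \int_{0}^{\Delta}\sum_k w_{1k}(T - \tau)\,m_k(\tau)\,d\tau,\ \ldots,\ z_n + \int_{0}^{\Delta}\sum_k w_{nk}(T - \tau)\,m_k(\tau)\,d\tau;\ T - \Delta\right] + \lambda\int_{0}^{\Delta} H(m_1, m_2, \ldots, m_r)\,d\tau\right\} \tag{3-45} $$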


where the minimum is taken over all $m_k$ defined over the interval
$(0, \Delta)$ and satisfying the constraints given in equation (3-35).
An analytical solution of the terminal control problem is not easy
to derive, but an approximate solution may be obtained from the
following recurrence relationship:

$$ f(z_1, z_2, \ldots, z_n; T) = \min_{m_k}\left\{f\left(z_1 + \Delta\sum_k w_{1k}(T)\,m_k,\ \ldots,\ z_n + \Delta\sum_k w_{nk}(T)\,m_k;\ T - \Delta\right) + \lambda\Delta\,H(m_1, m_2, \ldots, m_r)\right\} \tag{3-46} $$

with

$$ f(z_1, z_2, \ldots, z_n; 0) = G\left[z_1(0), z_2(0), \ldots, z_n(0)\right] \tag{3-47} $$

3) Minimum-time Control Process. Consider the nonlinear
control process characterized by the vector differential
equation

$$ \dot{x} = g(x, m, t) \tag{3-48} $$

where x is an n-vector representing the state and m is an
r-vector. Determine the optimum-control strategy which will
transform the process from a given initial state

$$ x(t_0) = x_0 \tag{3-49} $$

to the desired final state

$$ x(t_f) = x_f \tag{3-50} $$

in minimum time, where the final time $t_f$ is unspecified.

Let the minimum time to transform the process from the state x(t)
to the desired final state $x(t_f)$, with the control vector
optimally chosen, be

$$ f(x, t) = \min_{m}\,\{t_f - t\} \tag{3-51} $$

On the basis of the definition of f(x, t), it follows that at
the end of the optimal trajectory

$$ \left.\frac{\partial f}{\partial t}\right|_{t=t_f} = 0 \tag{3-52} $$

and

$$ \left.\frac{\partial f}{\partial x_i}\right|_{t=t_f} = 0 \tag{3-53} $$

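By the principle of optimality, the minimum time is given
by the functional equation

$$ f(x, t) = \min_{m}\left\{\Delta + f(x + \dot{x}\Delta,\ t + \Delta)\right\} \tag{3-54} $$

Expanding $f(x + \dot{x}\Delta,\ t + \Delta)$ into a Taylor series and
simplifying as before, we obtain

$$ \min_{m}\left\{\Delta + \frac{\partial f}{\partial x}\,g(x, m, t)\,\Delta + \frac{\partial f}{\partial t}\,\Delta + \epsilon(\Delta)\right\} = 0 \tag{3-55} $$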


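If $\Delta$ approaches zero as a limit, equation (3-55) becomes

$$ \min_{m}\left\{\frac{\partial f}{\partial x}\,g(x, m, t) + \frac{\partial f}{\partial t}\right\} = -1 \tag{3-56} $$

From equation (3-56) and from arguments similar to those for
deriving equations (3-18) and (3-19), we obtain the following two
equations:

$$ \frac{\partial f}{\partial x}\,\frac{\partial g(x, m, t)}{\partial m} = 0 \tag{3-57} $$

$$ \frac{\partial f}{\partial x}\,g(x, m, t) + \frac{\partial f}{\partial t} + 1 = 0 \tag{3-58} $$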
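
The functional equation (3-54) can be illustrated on a very simple time-independent process. The sketch below is an illustration under stated assumptions, not a general method: it takes the hypothetical scalar process $\dot{x} = m$ with $|m| \le 1$ and final state x = 0, tabulates f on the grid $x = i\Delta$, and restricts m to the values -1 and +1 so that every step lands on a grid point. The name `minimum_time_table` is illustrative only.

```python
def minimum_time_table(n_points, delta):
    """Solve f(x) = min_m [delta + f(x + m*delta)], f(0) = 0, on x = i*delta."""
    inf = float("inf")
    f = {i: inf for i in range(-n_points, n_points + 1)}
    f[0] = 0.0                      # the desired final state needs no time
    changed = True
    while changed:                  # repeat sweeps until no entry improves
        changed = False
        for i in f:
            if i == 0:
                continue
            # m = -1 or +1 moves the state one grid step in one interval delta
            best = min(delta + f.get(i + m, inf) for m in (-1, 1))
            if best < f[i]:
                f[i], changed = best, True
    return f

table = minimum_time_table(8, 0.25)
print(table[4], table[-8])          # minimum times from x = 1 and x = -2
```

For this process the table reproduces f(x) = |x|, which satisfies equation (3-58) with $\partial f/\partial t = 0$ and the control m = -sign(x).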


Furthermore, if the function g is time-independent, equation
(3-58) reduces to

$$ \frac{\partial f}{\partial x}\,g(x, m) + 1 = 0 \tag{3-59} $$

When the partial derivative $\partial f/\partial x$ is known at a point,
equation (3-57) and equation (3-59) can be solved for the optimal
control vector m.


SUMMARY


Among the many optimum-control problems, three basic types
are of fundamental importance. They are the minimum-time
control problem, the terminal-control problem, and the
minimum-integral control problem. Based on these three kinds of
problems, we have discussed the optimal-control system by the
calculus of variations, the maximum principle, and dynamic
programming. In discussing the calculus of variations, the
minimum-integral control problem is classified as the Lagrange
type, and the minimum-time control and terminal-control problems
are classified as Mayer problems.

The optimum design of control systems by the calculus of
variations generally leads to a two-point boundary value
problem. Analytical solutions for such problems are possible only
in special cases. Since the resulting Euler-Lagrange differential
equations are usually nonlinear, numerical trial-and-error
techniques must be resorted to. These techniques assume an
initial value for the missing initial condition and numerically
integrate the Euler-Lagrange equations and the constraining
equations. This work belongs to applied mathematics and
numerical analysis, and was not discussed here. The difficulty
in solving a two-point boundary value problem makes the
classical calculus of variations less attractive in the design
of an optimal-control system. Further, the variational calculus
approach is generally limited to systems subject to control
signals with unrestricted bounds.



The maximum principle of Pontryagin provides an elegant
method of obtaining an optimal solution for very general
dynamical processes. It treats the optimization problem of
maximizing or minimizing a function subject to certain constraints.
In general, a new state variable $x_{n+1}$ is introduced to convert
the optimum control problem to the optimization of this new
co-ordinate, and the Pontryagin function $\mathcal{P} = b'x(t_f)$,
subject to certain constraints, is used for this new co-ordinate.

In general, the maximum principle provides a necessary
condition for system optimization. However, if the control process
is linear and subject to an additive control function, it
provides the necessary and sufficient condition for optimum
control. Although the application of the maximum principle is not
restricted to systems with unbounded control signals, it is
subject to the same difficult two-point boundary value problem as
the variational calculus.

The basic theory of dynamic programming is the principle
of optimality and the functional equation approach. Following
the formal analysis, the optimal control problem can be reduced
to the determination of the solution of the Hamilton-Jacobi
equation. The functional equation approach of dynamic
programming provides a way of obtaining the computational solution
of an optimization problem which does not depend upon the solution
of the partial differential equation, thus circumventing
difficulties with the two-point boundary value problem. The
principle of optimality is applied to the derivation of the partial
differential equations describing the optimal control signals.




Constraints on control signals are considered in the optimum
design. For control processes of moderate complexity, a
solution to the partial differential equation is generally
difficult to derive, and resort is often made to numerical analysis
through the functional equation approach. The constraint on
the control signal defines a finite range of possible values,
and this makes the computation easier.

Optimal control problems are viewed as variational problems.
Three basic variational methods are discussed for maximizing or
minimizing a functional over a function space. They are:

1.  The  calculus  of  variations 

2.  The  maximum  principle 

3.  Dynamic  programming. 

Three  basic  problems  in  optimal  control  systems  are  stated 
as  typical  problems.   They  are: 

1.  The  minimum-time  control  problem 

2.  The  terminal-control  problem 

3.  The  minimum-integral  control  problem. 

In all cases the goal is to find the optimum control law
or sequence such that the given function of the performance
indices is maximized or minimized. In realistic and practical
situations, physical constraints on control signals or state
variables must be taken into account, and these make optimum
control problems more complicated.


